ABSTRACT 

This study represented a first attempt to evaluate the impact 
of local item dependence (LID) for Item Response Theory (IRT) scoring in 
computerized adaptive testing (CAT). The most basic CAT design and a 
simplified design for simulating CAT item pools with varying degrees of LID 
were applied. A data generation method that allows the LID among items to be 
defined was applied to generate five CAT item pools. The LID exhibited by 
these pools ranged from zero LID (complete local item independence) to extreme 
LID. Results indicate that for certain types of scoring an extreme amount of 
LID may adversely impact the final score obtained by the examinee. The 
estimated precision of the test was also affected by the extreme LID level 
studied here. For the medium level of LID, structured to display the amount of 
LID typically displayed by the Law School Admission Test, the effects of the 
LID were not troublesome. (Contains 8 figures and 36 references.) (SLD) 
Executive Summary 



In computerized adaptive testing (CAT) an attempt is made to select items for individual test takers that are 
appropriate for their ability level. This adaptation of the difficulty level of the test to the ability level of the test 
taker is made possible through the application of item response theory (IRT). IRT is a mathematical model that 
relates the probability that a test taker will answer a single test item (i.e., test question) correctly to the ability 
level of the test taker and specific characteristics of the test item. In applying IRT, a formal assumption of local 
item independence is made. This assumption states that once the ability level of the test taker is accounted for, 
the responses of test takers to individual items on the test should be statistically independent. 
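As a minimal sketch of what this assumption means in practice, the Python below computes the probability of a correct response and the likelihood of a response pattern. It assumes the three-parameter logistic (3PL) form of IRT, which this summary does not specify; the function names, the parameter names (a, b, c), and the scaling constant D = 1.7 are illustrative assumptions rather than the study's own notation.

```python
import math


def p_correct(theta, a, b, c=0.0, D=1.7):
    """Probability of a correct response under an assumed 3PL model.

    theta: ability level of the test taker
    a, b, c: item discrimination, difficulty, and lower asymptote (guessing)
    D: conventional scaling constant (assumed here to be 1.7)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def pattern_likelihood(theta, items, responses):
    """Likelihood of a response pattern given theta.

    Local item independence is what justifies taking a simple product of
    the per-item probabilities; under LID this product is no longer the
    correct joint probability.
    """
    likelihood = 1.0
    for (a, b, c), u in zip(items, responses):
        p = p_correct(theta, a, b, c)
        likelihood *= p if u == 1 else (1.0 - p)
    return likelihood
```

For example, pattern_likelihood(0.0, [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1, 0]) multiplies the probability of a correct answer on the first item by the probability of an incorrect answer on the second. When items are locally dependent, IRT scoring still uses this product, which is why LID can distort both the ability estimate and its estimated precision.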

In a test-taking situation, many circumstances arise that cause the local item independence assumption to be 
violated to some degree. For instance, if a test section is especially difficult, fatigue may adversely affect the 
performance of test takers on the items at the end of the section. In this case, the difficulty level of the items 
found at the beginning of the section affects performance on later items, and so these items are said to exhibit 
some degree of local item dependence (LID). 

The impact of LID on various applications of IRT within the paper-and-pencil mode of testing has been 
evaluated. Depending on the particular test design, a computerized test may rely more heavily upon IRT for 
such procedures as item selection and ability estimation, and so the assumptions of the model become even more 
important. This study represents a first evaluation of the impact of LID for IRT scoring in CAT. As such, the 
most basic CAT design and a simplified design for simulating CAT item pools with various degrees of LID 
were applied. The results indicate that, for certain types of scoring, an extreme amount of LID may adversely 
impact the final score attained by the examinee (i.e., test taker). The estimated precision of the test was also 
affected by the extreme LID level studied here. For the medium level of LID, structured to display the amount 
of LID typically displayed by the LSAT, the effects of the LID were not troublesome. 

Future research in this area should focus on some of the computerized testing designs that are currently being 
evaluated for the LSAT. Also, future research should be carried out to evaluate LID levels that represent 
situations likely to arise in building an item pool for computerized testing. For example, the effect of 100 items 
displaying an extreme level of LID within a medium LID CAT pool should be evaluated. 

Introduction 

The local item independence assumption of item response theory (IRT) has generated a great deal of interest 
among psychometric researchers (Ackerman & Spray, 1987; Ackerman, 1987; Andrich, 1978, 1985; Bell, 
Pattison, & Withers, 1988; Chen & Thissen, 1997; Embretson, 1984; Goldstein, 1980; Jannarone, 1986, 1987, 
1991a, 1991b, 1991c, 1994; Kelderman, 1984; Kempf, 1977; Lord, 1953; Masters, 1988; Pashley & Reese, 
1999; Reese, 1995; Reese & Pashley, 1999; Rosenbaum, 1984; Spray & Ackerman, 1987; van den Wollenberg, 
1982; Wainer & Kiely, 1987; Yen, 1984, 1993). Up to this point, however, research in this area has focused 
primarily on the paper-and-pencil mode of testing. While some have alluded to the fact that local item 
dependence (LID) is an issue that will need to be addressed within the computerized adaptive testing (CAT) 
environment, little research has been carried out toward this end (one exception to this is Spray, Parshall, & 
Thomas, 1997). One reassurance that has been cited with regard to the impact of LID within the 
paper-and-pencil test is that the effects are somewhat equalized across examinees, since all are asked to respond to the 
same set of items. The effects of LID within the CAT environment are somewhat more troublesome in that 
examinees (i.e., test takers) respond to different sets of items, and any adverse impact may not be equalized 
across examinees for this testing mode. 

This research represents an initial study investigating the effects of LID within a CAT environment. A data 
generation method that allows the LID among items to be defined was applied to generate five CAT item pools. 
The LID exhibited by these item pools ranged from zero LID (complete local item independence) to extreme 
LID. The effect of the various levels of LID on scoring outcomes within a standard maximum information CAT 
was then evaluated. 
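For readers unfamiliar with maximum information item selection, the following Python sketch shows the general idea: at each step, the unadministered item with the largest Fisher information at the current ability estimate is chosen. It assumes a 3PL model with D = 1.7 and a pool stored as (a, b, c) tuples; these choices and all function names are illustrative assumptions, not a description of the exact procedure used in this study.

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Probability of a correct response under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c, D=1.7):
    """Fisher information of a 3PL item at ability theta."""
    p = p_correct(theta, a, b, c, D)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def select_next_item(theta_hat, pool, administered):
    """Return the index of the most informative unadministered item.

    theta_hat: current ability estimate
    pool: list of (a, b, c) item parameter tuples (hypothetical layout)
    administered: set of pool indices already given to this examinee
    """
    best_index, best_info = None, -1.0
    for index, (a, b, c) in enumerate(pool):
        if index in administered:
            continue
        info = item_information(theta_hat, a, b, c)
        if info > best_info:
            best_index, best_info = index, info
    return best_index
```

With a hypothetical pool of three items that differ only in difficulty (b = -2, 0, 2) and a current estimate of 0, the item with b = 0 is selected first. Because the information values are computed under the assumption of local item independence, they may overstate the precision actually achieved when the selected items are locally dependent.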